Types of search
When searching your EMu data, it is possible to:
- search for an exact phrase (Phrase search)
- find variations of a search term (Stemming)
- locate terms that sound like the search term (Phonetic search)
- make a search case sensitive
- perform range searches on dates, times and other numeric values, including latitude / longitude (Range search)
- locate records where two terms appear close together in a body of text (Proximity search)
- use wildcards (Pattern matching)
As we'll see, these various types of search can be combined in powerful ways.
EMu is Unicode compliant, which means that it is possible, among other things, to search for punctuation and other special characters either as individual characters (?) or as part of a more complex string (fred@global.com).
As we describe below, certain characters can have a special meaning when performing a search in EMu. For instance, a question mark can be used in place of a single character in a search term: if we aren't sure whether an ise or ize spelling has been used (e.g. organise / organize), we could use a ? in place of the s/z, i.e. organi?e, to search for both words.
However, because it is possible to search for a question mark in its own right, we need to tell EMu when we want its special meaning to apply.
The special meaning of a character is applied by escaping it with a backslash (\).
For instance:
- A search for organi?e will attempt to locate the eight characters organi?e.
- A search for organi\?e takes advantage of the question mark wildcard and will locate both organise and organize.
In this section we describe in detail how to perform searches in EMu. The Unicode Cheat Sheet provides a quick reference guide to how special characters are specified in a search:
AND, OR, NOT (known as Boolean operators) can be used to make a search more focused, yielding more precise results. In the illustrations below, the green (light) area indicates the types of record that will result from a search using each operator:
|
A search for rock AND roll will return all records containing both words in the search field. It will find items about rock and roll music, for instance. It might also find records that contain both words in a different context, such as "As hard as he tried, he couldn't roll the rock away from the cave entrance". It will not return records in which rock appears without roll (or vice versa). Tip: If too many records are returned by your search, add another search term with the AND operator. |
|
A search for rock OR roll will return all records that contain either word in the search field. If a record includes both words, it will be returned by the search, but it only matters that one of the words is located. Tip: If too few records are returned by your search, add another search term using the OR operator. |
|
A search for rock NOT roll will return records that contain rock in the search field, but not if the search field also includes roll. Any record that includes roll in the search field will not be returned. Tip: Use the NOT operator to eliminate records which include a term. |
AND and NOT generally decrease the number of records found by a search; OR generally increases the number of records found:
If two or more terms are entered on the same line in a field, the Boolean operator AND applies:
In this example, records containing both words rock and roll anywhere, and in any order, in the Notes field will be found.
Tip: You are not limited to two search terms in an AND search, i.e. you could enter rock roll music and all three terms would need to be present in the Notes field.
If two or more terms are entered on different lines in a field, the Boolean operator OR applies:
A record will be returned as long as it contains at least one of the words, rock or roll in this example.
Tip: You are not limited to two search terms in an OR search; simply add a search term to another row.
NOT is specified by placing an exclamation mark ! before a search term. The NOT operator excludes records that contain that term in the search field.
NOT search | Description |
---|---|
\!rock |
Records that contain the word rock in the search field will not be returned by this search. |
\!rock roll |
This search will return records that contain roll in the search field, but only if rock does not also appear in the field. Records with sausage roll in the search field will be returned; records with rock and roll in the search field will not be returned. |
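The AND, OR and NOT behaviour described above can be modelled with a short Python sketch. It illustrates the logic only and is not how EMu evaluates a query: the function names and sample records are hypothetical, field values are assumed to be non-empty, and the escaping backslash is omitted for readability.

```python
import re

def words(text):
    """Lower-cased set of words in a field value (a simplification)."""
    return set(re.findall(r"\w+", text.lower()))

def matches(field_value, rows):
    """Model the lines typed into one search field.

    Terms on the same row are ANDed, separate rows are ORed,
    and a leading '!' negates a term (NOT).
    """
    present = words(field_value)

    def row_ok(row):
        for term in row.split():
            negated = term.startswith("!")
            word = term.lstrip("!").lower()
            # A plain term must be present; a negated term must be absent.
            if (word in present) == negated:
                return False
        return True

    return any(row_ok(row) for row in rows)

notes = ["rock and roll music", "sausage roll", "rock garden"]
print([n for n in notes if matches(n, ["rock roll"])])     # AND: same row
print([n for n in notes if matches(n, ["rock", "roll"])])  # OR: separate rows
print([n for n in notes if matches(n, ["!rock roll"])])    # NOT rock AND roll
```

Only the first record satisfies the AND search, all three satisfy the OR search, and only sausage roll satisfies the NOT search.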
The NOT operator can be applied to any of the other search operators.
NOT search | Description |
---|---|
\!\^Unknown\$ |
Return records that contain anything apart from the single word Unknown (see Pattern Matching). |
\!\"Not Applicable\" |
Return records that do not contain the phrase Not Applicable (see Phrase Search). |
\"\==Sacré \==Cœur\" \!Paris
|
Return records containing the phrase Sacré Cœur with case and diacritic significance but not if the searched field also includes Paris (see Case & Diacritics). |
NOT and NULL (an empty field)
It is important to be aware that a NOT search will not return a record where the search field is empty (has no value), even though logically an empty field does not contain the excluded term:
For example, records may have a Record Status of Retired or Active. Where every record has a value in the Record Status field, a search for records where Record Status is NOT Retired will return every record with a Record Status of Active. However, if a record has no value in the Record Status field (it is empty), it will not be returned by this search.
The technical explanation is that it is only possible to match a NULL value (an empty field) with the IS NULL or IS NOT NULL operators.
The solution is to combine a NOT search with a NULL search. This will locate every record that does not have a particular value in a field, including where the field is empty.
Thus, to locate every record that does not have Retired in the Record Status field (all records with a Record Status of Active as well as any record with no value in Record Status), perform the following OR search:
OR | Description |
---|---|
\!Retired OR \!\* |
As we see in Pattern Matching: Using Wildcards, \!\* matches an empty field. This search will therefore return all records where the Record Status field does not include Retired OR is empty. |
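The NULL behaviour can be reproduced in any SQL-style database. The sketch below is an analogy only, using Python's built-in sqlite3 module: the table and column names are hypothetical, and it does not show EMu's own query engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (irn INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [(1, "Active"), (2, "Retired"), (3, None)],  # record 3 has an empty status
)

def irns(where):
    """Return the irn of every row matching the given WHERE clause."""
    rows = conn.execute(f"SELECT irn FROM records WHERE {where} ORDER BY irn")
    return [irn for (irn,) in rows]

# NOT on its own: the empty (NULL) record is silently left out.
not_only = irns("status <> 'Retired'")
# NOT combined with a NULL test: the empty record is included.
not_or_null = irns("status <> 'Retired' OR status IS NULL")

print(not_only)     # [1]
print(not_or_null)  # [1, 3]
```

Comparing NULL with any value is neither true nor false, which is why the extra IS NULL test is needed to pick up record 3.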
Many fields in EMu are text fields - names, addresses, titles, notes. The following text searches can be performed:
A phrase is one or more words adjacent to each other, e.g. business systems. In a Phrase search, quotation marks are used to define the phrase.
A Phrase search returns records that contain the search terms in the order in which they appear between the quotation marks:
Phrase |
Search results |
---|---|
\"business systems\" |
Records with the phrase business systems will be returned. A Phrase search is more precise than a Boolean AND search in which the words can appear anywhere and in any order in the search field. For example, an AND search for business systems would return a record containing the sentence "The business has many systems". Our example Phrase search would not return this record. |
|
A phrase search is not case or diacritics sensitive. For instance, in the first example, records with business systems, Business systems or Business Systems will be returned. And in this example, records containing the phrase Sacré-Cœur would be returned. Tip: See Case Sensitivity & Diacritics below for details about making a Phrase search case sensitive. |
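The difference between a Phrase search and a Boolean AND search can be sketched in a few lines of Python. This is an illustration under simplifying assumptions (plain word splitting, case folded with lower(), diacritics not handled); the function names are hypothetical and do not reflect EMu internals.

```python
import re

def tokens(text):
    """Split text into lower-cased words."""
    return re.findall(r"\w+", text.lower())

def and_search(text, terms):
    """All terms must be present, anywhere and in any order."""
    return {t.lower() for t in terms} <= set(tokens(text))

def phrase_search(text, phrase):
    """The words of the phrase must be adjacent and in the given order."""
    words, target = tokens(text), tokens(phrase)
    return any(
        words[i:i + len(target)] == target
        for i in range(len(words) - len(target) + 1)
    )

sentence = "The business has many systems"
print(and_search(sentence, ["business", "systems"]))    # True
print(phrase_search(sentence, "business systems"))      # False
print(phrase_search("Our Business Systems team", "business systems"))  # True
```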
With stemming it is possible to find records that contain variations / derivatives of a search term (e.g. plurals, adjectival forms, conjugations).
Stemming is specified by placing a ~ (tilde) directly before the search term:
Stemming |
Search results |
---|---|
\~system |
system, systems, systematic, systematics, systemless, systematically, systematise, systemic, systemise |
\~view |
view, viewer, viewing, viewable, viewership, viewfinder, viewgraph, viewless, viewpoint |
\~bracket |
bracket, bracketed, bracketing |
A Phonetic search returns records that contain terms that sound like the search term. A Phonetic search uses the @ symbol:
Phonetic |
Search results |
---|---|
\@krystal |
crystal, krystal |
\@color |
colour, color |
\@smith |
smith, smyth, smythe |
Text searches are not case sensitive by default. A search for business will return records with the word business or Business, for instance. To make a text search case sensitive, use an equals sign = before the search term:
Case Sensitive |
Search results |
---|---|
\=Business |
This search will return records which contain Business but not business. |
\=\"Business system\" |
This search combines a case sensitive search with a phrase search and will return Business system but not business system. |
As text searches are case insensitive by default, there is rarely a need to specify that a search is case insensitive. A situation in which it would be useful is where we specify a case sensitive Phrase search but want a word in the phrase to be case insensitive. We use \& to specify case insensitivity:
Case Insensitive |
Search results |
---|---|
\=\"This is about our Business \&Systems\" |
This search would locate both This is about our Business Systems and This is about our Business systems. |
Matching on both case AND diacritics is specified with \==.
Search | Find |
---|---|
\==Sacré \==Cœur |
This search will return records with Sacré and Cœur exactly as specified, that is matching case and diacritics, but not necessarily next to each other. |
\"\==Sacré \==Cœur\" |
This example combines a Phrase search with case and diacritics matching. It will return records with Sacré Cœur exactly as specified, in the same order and matching case and diacritics. |
With Pattern matching it is possible to:
- Substitute one or more wildcard characters for one or more letters in a search term.
- Specify that a term appears at the beginning or end of a field.
Tip: Wildcard characters can be used alone, in combination with each other and with other types of search. See below for some examples of combined searches.
Wildcard |
Use |
Examples |
---|---|---|
\* |
Substitutes for zero or more characters at its position in a search term. Tip: To return all records with something in the search field, that is, all non-empty fields, enter \* on its own. |
appl\* will match words starting with appl, e.g. apple, application, applied, etc. edit\* will match words starting with edit, e.g. edit, edits, edited, etc. |
\? |
Substitutes for any single character at its position in a search term. |
appl\? will match apply and apple (but not apples). organi\?e will match organise and organize. \?\?\? will match any combination of three graphemes (letters, digits, any character). |
\^ |
Place at the beginning of a search term to specify that the term must appear at the start of the field. |
\^Hospital will match Hospital for Patriots, and Hospital for Abandoned Animals (but not Patriots Hospital or Chicago Hospital). \^the will match records with text beginning with the word the. |
\$ |
Place at the end of a search term to specify that the term must appear at the end of the field. |
Organi\?ation\$ will match Funeral Directors Organisation and Funeral Directors Organization (but not Organisation of Funeral Directors). ?\$ will match records that have text ending in a question mark. |
\[any one of\] |
Matches any one of the characters specified in any one of. any one of may consist of individual characters, or a range may be specified as a beginning and end character separated by a minus sign (e.g. 0-9). |
Organi\[sz\]ation will match Organisation and Organization. \[0-9\] will match any single digit from 0 to 9. |
\{one or more of\} |
Matches one or more of the characters specified in one or more of. one or more of may consist of individual characters, or a range may be specified as a beginning and end character separated by a minus sign. |
Organi\{sz\}ation will match Organisation, Organization and (in the event of a typo) Organiszation. |
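One way to get a feel for the wildcards in the table is to translate them into Python regular expressions. The sketch below is an approximation for experimenting with single words, not EMu's matching engine: the function name to_regex is hypothetical, matching is case insensitive, and the field anchors \^ and \$ are not modelled.

```python
import re

def to_regex(term):
    """Translate a backslash-escaped wildcard term into a compiled regex.

    Escaped characters take their special meaning; unescaped ones are literal.
    """
    out = []
    i = 0
    while i < len(term):
        ch = term[i]
        if ch == "\\" and i + 1 < len(term):
            nxt = term[i + 1]
            if nxt == "*":                      # zero or more characters
                out.append(".*")
            elif nxt == "?":                    # exactly one character
                out.append(".")
            elif nxt == "[":                    # any one of the listed characters
                end = term.index("\\]", i)
                out.append("[" + term[i + 2:end] + "]")
                i = end
            elif nxt == "{":                    # one or more of the listed characters
                end = term.index("\\}", i)
                out.append("[" + term[i + 2:end] + "]+")
                i = end
            else:
                out.append(re.escape(nxt))
            i += 2
        else:
            out.append(re.escape(ch))           # unescaped: literal character
            i += 1
    return re.compile("".join(out), re.IGNORECASE)

print(bool(to_regex(r"organi\?e").fullmatch("organize")))   # True
print(bool(to_regex(r"organi?e").fullmatch("organize")))    # False: ? is literal
print(bool(to_regex(r"Organi\{sz\}ation").fullmatch("Organiszation")))  # True
```

Note how the unescaped organi?e matches only the literal eight characters, mirroring the escaping rule described at the start of this section.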
Some useful wildcard searches
Wildcard |
Description |
Example |
---|---|---|
|
This will only return records that have the single word term in the search field, i.e. term must be the start and end of the value in a field. Note: |
|
|
This will return records with nothing in the search field (an empty field). As we've seen, \* is useful if you want to return all records with something (anything) in the search field. \!\* means: "search for NOT anything (i.e. nothing) in this field". Note: See Search for an empty field: a NULL search for more details. |
|
|
Is useful when you are unsure of the search term's spelling. |
|
|
Use in an attachment field to return records that do NOT have any attachments. |
|
|
Use in an attachment field to return records that do have one or more attachments. |
When running a search in a numeric, date, time or latitude / longitude field, it is possible to search for a range of values (for instance, from 1 to 100, or between 1 January 1994 and 1 January 2005). Relational operators (>, >=, <, <=) are used to specify the lower limit, the upper limit, or both limits of the date, time or number for which you are searching.
Tip: Alphanumeric data is handled a little differently to numeric data (date fields can take alphanumeric data, and values in latitude and longitude fields are by their nature alphanumeric, e.g. 51 39 00 N). Whenever you use alphanumeric data in a range search, the value must be enclosed in double quotes (e.g. \"51 39 00 N\" or \"1 January 1970\").
Tip: It is not necessary to escape relational operators (>, >=, <, <=).
Tip: It is not necessary to use an equals sign (=) to find a specific date or number.
Tip: It is possible to specify an upper and lower limit by combining range search terms. For example, >=30 <=50 will find all records from 30 to 50 inclusive.
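Combining range terms on one line works like an AND search: every term must hold. A minimal Python sketch of that idea follows, for plain numbers only (dates and latitude / longitude values are not modelled); the function name in_range is hypothetical.

```python
import operator
import re

# Map each relational operator to its comparison function.
OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt, "<": operator.lt}

def in_range(value, query):
    """True if the value satisfies every range term in the query."""
    # Two-character operators are listed first so ">=" is not read as ">".
    terms = re.findall(r"(>=|<=|>|<)\s*(\d+(?:\.\d+)?)", query)
    return all(OPS[op](value, float(num)) for op, num in terms)

print([n for n in (29, 30, 40, 50, 51) if in_range(n, ">=30 <=50")])
# [30, 40, 50]
```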
Operator |
Description |
Example |
---|---|---|
> |
Finds records greater than the specified date or number. |
>\"2 Apr 1999\" will find records after 2 April 1999. >890 will find records greater than 890. |
< |
Finds records less than the specified date or number. |
<\"2 Apr 1999\" will find records before 2 April 1999. <890 will find records less than 890. |
>= |
Finds records greater than or equal to the specified date or number. |
>=\"2 Apr 1999\" will find records on or after 2 April 1999. >=890 will find records greater than or equal to 890. Note: For reasons explained below in A warning about partial range searches, it is best to avoid the use of >= in a partial range search (e.g. with an incomplete date or latitude / longitude). |
<= |
Finds records less than or equal to the specified date or number. |
<=\"2 Apr 1999\" will find records on or before 2 April 1999. <=890 will find records less than or equal to 890. Note: For reasons explained below in A warning about partial range searches, it is best to avoid the use of <= in a partial range search (e.g. with an incomplete date or latitude / longitude). |
When performing a partial range search, it is important to avoid combining = with the relational operators < and >, that is, to avoid <= and >=.
Consider this query using a partial date:
AdmDateModified >= 2012
There are effectively two requests here:
- Locate records with a date that is greater than 2012.
- Locate records that equal 2012.
The first search will return all records where AdmDateModified is from 1 January 2013, which is as expected.
However, the second part will always return no matches. A date recorded in EMu comprises three parts (day/month/year), but some parts of the date in our search term are unknown: in this case the day and month are technically NULL (although it might be easier to think of NULL as UNKNOWN). What we intend by our search is that any record with a year of 2012 is returned, but when we use the equals modifier (=) we're actually requesting that the date recorded in AdmDateModified exactly matches NULL/NULL/2012 (or UNKNOWN/UNKNOWN/2012). This will of course match nothing.
A reliable method for capturing all records within a year where day and / or month may be missing is to search with > and the preceding year. For instance, to find all records modified in 2019 or later, you could search for:
AdmDateModified >2018
This will return all records from 1 January 2019 onwards.
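The behaviour described above can be pictured with a simplified Python model in which unknown date parts are represented by None. This is a sketch of the reasoning only, under the assumption that a comparison stops as soon as it reaches an unknown part; the type alias and function names are hypothetical and this is not EMu's implementation.

```python
from typing import Optional, Tuple

# (year, month, day); None stands for an unknown (NULL) part.
Date = Tuple[int, Optional[int], Optional[int]]

def equals(a: Date, b: Date) -> bool:
    """Equality fails as soon as any part is unknown on either side."""
    return all(x is not None and y is not None and x == y for x, y in zip(a, b))

def greater(a: Date, b: Date) -> bool:
    """Compare year, then month, then day; an unknown part decides nothing."""
    for x, y in zip(a, b):
        if x is None or y is None:
            return False
        if x != y:
            return x > y
    return False

record = (2012, 6, 15)         # a record modified on 15 June 2012
partial_2012 = (2012, None, None)

print(equals(record, partial_2012))            # False: = 2012 matches nothing
print(greater(record, partial_2012))           # False: not after "2012"
print(greater((2013, 1, 1), partial_2012))     # True: from 1 January 2013
print(greater(record, (2011, None, None)))     # True: >2011 captures all of 2012
```

In this model a search for >= 2012 misses the 2012 record on both counts, while > 2011 captures it, which is why searching with > and the preceding year is the safer form.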
Latitude / Longitude
A similar issue arises with partial DMS latitude and longitude values.
Consider this query using a partial latitude:
LatLatitude_nesttab >=\"34 30 S\"
There are effectively two requests here:
- Locate records where latitude is greater than 34 30 S.
- Locate records where latitude equals 34 30 S.
The first search will return all records with a latitude north of 34 30 S.
However the second part will in fact always return no matches. As with dates, the reason is that you cannot have a partial latitude EQUAL anything as some parts of the DMS latitude are unknown (in this case seconds is NULL).
Again, the solution is to avoid using <= and >= with partial ranges. The query:
LatLatitude_nesttab >\"34 31 S\"
would return records with a latitude equal to or north of 34 30 S.
With a Proximity search it is possible to search for words in a text field and specify the number of words, sentences or paragraphs separating them. A Proximity search works on the principle that the proximity of two terms is suggestive of their association with each other. This can be useful when searching fields with large bodies of text.
Proximity searches take the form of:
\'\(term1 term2\) ordered modifier operator number unit\'
where:
term1 term2 |
are the two terms to search for, enclosed in brackets, e.g.:
\(state hospital\) |
ordered |
is optional. If given, the search terms must be in the order specified. |
modifier |
is either:
-OR-
|
operator |
is a relational operator (<, <=, >, >=, =). Note: Relational operators do not need to be escaped in a search. |
number |
is a number or range to indicate the proximity or proximity range of each term to the other, e.g. |
unit |
is the unit of proximity by which to search, e.g. word, sentence or paragraph:
\(state hospital\) <= 5 words will find records where the search terms state and hospital are equal to or less than five words apart. Note: The unit of proximity can be abbreviated to the first letter of the word, e.g. w for words. |
The entire search phrase is enclosed within single quotes, e.g.:
\'\(state hospital\) <= 5 w\'
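To make the idea of word proximity concrete, here is a small Python sketch that checks whether two terms occur within a given number of words of each other. It rests on assumptions that may differ from EMu: distance is taken as the difference between word positions, terms are matched as word prefixes (roughly like a trailing \* wildcard), and only the word unit is handled. The function name within_words is hypothetical.

```python
import re

def within_words(text, term1, term2, max_words, ordered=False):
    """True if a word starting with term1 and a word starting with term2
    occur at most max_words apart. With ordered=True, term1 must come first."""
    tokens = re.findall(r"\w+", text.lower())
    pos1 = [i for i, t in enumerate(tokens) if t.startswith(term1.lower())]
    pos2 = [i for i, t in enumerate(tokens) if t.startswith(term2.lower())]
    for i in pos1:
        for j in pos2:
            gap = j - i
            if ordered and gap < 0:
                continue  # term2 appears before term1
            if 0 < abs(gap) <= max_words:
                return True
    return False

text = "The state opened a new hospital in 1902."
print(within_words(text, "state", "hospital", 5))                 # True
print(within_words(text, "hospital", "state", 5, ordered=True))   # False
print(within_words(text, "state", "hospital", 3))                 # False
```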
You want to find Bibliographic records about the ethics of cloning. You type clon\* and ethic\* in the Notes field and run a search. The following record is one of the matches:
Even though the two terms do appear in the same record, they are quite far apart, and while the record does discuss cloning and does mention ethics, it is not about the ethics of cloning.
To further refine the search, you now run a proximity search. This search specifies that the two terms must appear within five words of each other:
\'\(clon\* ethic\*\) <= 5 words\'
This time the above record is not found because the search terms are too far apart. However, this search does find the following record because the two terms appear within five words of each other. The probability that this record is on the subject of ethics and cloning is much higher, and indeed this is correct:
Various types of search can be combined in powerful ways. For example:
Find | Search |
---|---|
Records with at least one word containing a capital S |
\=\*S\*
|
Records where Fred occurs, matching case, in the same sentence as a word that sounds like Smith, with Fred appearing first. |
|
Records where the kanji character 豈 appears within 5 characters of the phrase 香港. |
|
Records that contain anything apart from the single word Unknown. |
\!\^Unknown\$
|
Records that do not contain the phrase Not Applicable. |
\!\"Not Applicable\"
|
Records containing the phrase Sacré Cœur with case and diacritic significance but not if the searched field also includes the word Paris. |
\"\==Sacré \==Cœur\" \!Paris
|
Records containing either an upper case or lower case |
|
Records with text where the first word starts with a lower case Latin letter. |
|